Quantum detectors provide information about the microscopic properties of quantum systems by 
establishing correlations between those properties and a set of macroscopically distinct events that 
we observe. The question of how much information a quantum detector can extract from a system 
is therefore of fundamental significance. In the present paper we address this question within a 
precise framework: given a measurement apparatus implementing a specific POVM measurement, 
what is the optimal performance achievable with it for a specific information readout task, and 
what is the optimal way to encode information in the quantum system in order to achieve this 
performance? We consider some of the most common information transmission tasks — the Bayes cost 
problem, unambiguous message discrimination, and the maximal mutual information. We provide 
general solutions to the Bayesian and unambiguous discrimination problems. We also show that the 
maximal mutual information is equal to the classical capacity of the quantum-to-classical channel 
describing the measurement, and study its properties in certain special cases. For a group covariant 
measurement, we show that the problem is equivalent to the problem of accessible information of 
a group covariant ensemble of states. We give analytical proofs of optimality in some relevant 
cases. The framework presented here provides a natural way to characterize generalized quantum 
measurements in terms of their information readout capabilities. 



I. INTRODUCTION 

Quantum detectors provide the interface between the 
microscopic world of quantum phenomena and the world 
of macroscopically distinct events that we observe. A 
quantum detector is a device that interacts with the sys- 
tem under observation in a way that establishes corre- 
lations between certain properties of the system and a 
set of macroscopically distinct (orthogonal) states of the 
device. A general quantum detector can be described by
a positive operator-valued measure (POVM), i.e., a set
of positive operators $\{E_i\}$, $E_i \geq 0$, $i = 1, \ldots, M$,
summing up to the identity, $\sum_i E_i = I$. For an input state $\rho$,
the probability that the measurement yields outcome $j$
is given by the Born rule, $p_j(\rho) = \mathrm{Tr}(\rho E_j)$.
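
As a concrete illustration of the Born rule, the following minimal Python sketch evaluates the outcome probabilities of a qubit POVM. The trine POVM, the input state, and the helper names (`trine_povm`, `trace_product`, `born_probabilities`) are assumptions chosen for illustration; they do not appear in the text.

```python
import math

def trine_povm():
    """Three real 2x2 elements E_k = (2/3)|phi_k><phi_k| that sum to the identity."""
    elements = []
    for k in range(3):
        theta = k * math.pi / 3          # state-vector angles 0, 60, 120 degrees
        c, s = math.cos(theta), math.sin(theta)
        elements.append([[2 / 3 * c * c, 2 / 3 * c * s],
                         [2 / 3 * c * s, 2 / 3 * s * s]])
    return elements

def trace_product(a, b):
    """Tr(a b) for 2x2 matrices stored as nested lists."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))

def born_probabilities(rho, povm):
    """Outcome probabilities p_j = Tr(rho E_j)."""
    return [trace_product(rho, e) for e in povm]

rho = [[1.0, 0.0], [0.0, 0.0]]           # the pure state |0><0| (illustrative input)
probs = born_probabilities(rho, trine_povm())
print(probs)
```

Because the elements sum to the identity, the probabilities sum to one for any normalized input state.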

A natural question is to what extent a given quan- 
tum detector is able to provide information about the 
system it is used to observe. This question can be con- 
veniently formulated in the context of a quantum com- 
munication scenario, where a sender (Alice) tries to send 
messages to a receiver (Bob) who is constrained to read 
those messages using the quantum detector in question. 
Concretely, let the source of classical information that 
Alice wants to communicate to Bob be characterized by 
a probability distribution $\pi_i > 0$, $i = 1, \ldots, N$, $\sum_i \pi_i = 1$,
that specifies the probability of each classical message $i$.
Alice encodes the different messages into quantum states
via an encoding map $i \to \rho_i$, and Bob reads the information
by performing the POVM measurement. If there
are no constraints on the way Alice can prepare the sig- 
nal states and these states can reach Bob undisturbed 
(i.e., Alice and Bob are connected through a noiseless 
channel) , then the optimal performance they can achieve 
for a given task can be regarded as quantifying the read- 
out capabilities of the measurement with respect to that 
task. In this respect, a problem of primary importance
is to find the optimal encoding (or signal states $\rho_i$) for
which the detector achieves its optimal performance.

The problem just outlined bears strong similarities to
the problem of quantum state discrimination [IHH], where
the encoding of Alice is fixed and Bob's task is to decide 
which message he has received by optimizing his mea- 
surement. In fact, we will see below that the two prob- 
lems can be regarded as dual to each other due to the 
symmetry that exists between the input ensembles and 
the POVM measurements. This allows one to adopt re- 
sults from quantum state discrimination to the problem 
at hand. However, since in quantum state discrimination
the space over which we optimize is more constrained due
to the completeness relation $\sum_i E_i = I$, it turns out that
in many cases the problem of optimal signal states for
quantum detectors is easier to solve.

Besides its application for characterizing detectors, the
problem considered here is of natural practical interest
for quantum communication, since generating different
signal states $\rho_i$ can be experimentally more accessible
than performing different measurements. A quantum
detector is usually fixed, while a preparation device,
although possibly also based on a fixed (but non-destructive!)
measurement, can be used together with
post-selection, which provides additional flexibility to the 
preparation process. Furthermore, in the case of commu- 
nication through a noiseless channel, any operation at the 
receiver's side prior to the detector can be equally done 
as part of the preparation strategy. 

In this paper, we consider the above problem from 
the perspective of three different information transmis- 
sion tasks — the task of optimal Bayes cost message dis- 
crimination (of which the well known problem of mini- 
mum error discrimination is a special case), unambiguous 
message discrimination, and the maximal mutual infor- 
mation. Due to the simplification mentioned above, we 
are able to provide solutions to the Bayesian and un- 
ambiguous discrimination problems in the general case. 
For the maximal mutual information, we show that this 
quantity is equal to the classical capacity of the quantum- 
to-classical channel corresponding to the measurement, 
which we term the "capacity of the measurement". This
quantity provides a general figure of merit for the in- 
formation readout capabilities of a detector. Based on 
its relation to the accessible information [5], we prove 
a result similar to Davies's theorem [2] (Proposition 2), 
which shows that the optimal ensemble can be chosen to 
consist of $d^2$ pure states, where $d$ is the dimension of the
system. For a group covariant measurement, we obtain 
that the problem is equivalent to that of accessible in- 
formation of a group covariant ensemble of states. We 
apply our results to the case of a noisy two-level sym- 
metric informationally complete measurement, for whose 
capacity we give analytical proofs of optimality. 

II. THE BAYES COST PROBLEM 

In the Bayes cost problem, one is interested in minimizing
an average cost function of the form

$$C(P) = \sum_{ij} C_{ij} P_{ij}, \tag{1}$$

where $P_{ij} = \mathrm{Tr}(\pi_i \rho_i E_j)$ are the joint probabilities for
input $i$ and measurement outcome $j$, and $C_{ij} \geq 0$ are the
elements of the cost matrix ($C_{ij}$ is the cost of choosing
hypothesis $j$ when hypothesis $i$ is true). In what we will
refer to as the straight version of this problem, one assumes
that the encoding $i \to \rho_i$ is given, and the task is
to find the measurement $\{E_j\}$ that minimizes the quantity
in Eq. (1) [1]. An example of a Bayes cost problem
is that of minimum error discrimination, i.e., minimizing
the probability of incorrectly identifying the message.
In this case, the probability of an error is given by
$P_{\mathrm{err}} = \sum_{i \neq j} P_{ij}$, i.e., the elements of the cost matrix are
$C_{ij} = 1 - \delta_{ij}$.



Here we are concerned with the opposite scenario 
which we will refer to as the reverse problem: we as- 
sume that the receiver has an apparatus that implements 
a particular POVM measurement, and we ask what the 
optimal way to encode the classical messages into quan- 
tum states is so that, using only the given POVM mea- 
surement and possibly some side information processing, 
the receiver will identify the message at the lowest cost. 
This side information processing involves finding the op- 
timal way of choosing hypothesis i when the measure- 
ment outcome k takes place, and includes the possibility 
of following a mixed strategy, i.e., assigning a hypothesis 
i randomly according to some prescribed probability dis- 
tribution, which might of course depend on the outcome 
k. In other words, the receiver can use the given POVM
$\{E_k\}$ to obtain new POVM measurements with elements
of the form

$$\tilde{E}_j = \sum_k p(j|k) E_k, \qquad \sum_j p(j|k) = 1 \ \text{for all } k, \tag{2}$$

where $0 \leq p(j|k) \leq 1$ are conditional probabilities.

Up to renormalization of the cost matrix, we can assume
that $0 \leq C_{ij} \leq 1$. Hence, the problem is equivalent
to that of maximizing the quantity

$$B(P) \equiv 1 - C(P) = \sum_{ij} (1 - C_{ij}) P_{ij} \equiv \sum_{ij} B_{ij} P_{ij}, \tag{3}$$

where

$$0 \leq B_{ij} \leq 1, \quad \forall i, j. \tag{4}$$

For a given POVM measurement $\{E_i\}$, consider some
encoding and decoding strategies given by the map $i \to \rho_i$
and the conditional probability distribution $p(j|k)$,
respectively. For these strategies, the quantity $B(P)$ reads

$$B(P) = \sum_{ijk} B_{ij} \pi_i \, p(j|k) \, \mathrm{Tr}(\rho_i E_k). \tag{5}$$

Define $j^*(k)$ to be a value of $j$ for which the quantity
$\sum_i B_{ij} \pi_i \mathrm{Tr}(\rho_i E_k)$, for a fixed $k$, is maximal (if there are
two or more such values, we can pick any one of them).
Then,

$$B(P) \leq \sum_{ik} B_{i j^*(k)} \pi_i \mathrm{Tr}(\rho_i E_k), \tag{6}$$

which is achievable by choosing $p(j|k) = \delta_{j j^*(k)}$.

We see that for the purpose of achieving the maximum
in Eq. (3), the receiver does not need a mixed strategy,
i.e., the maximum can be achieved by choosing all
conditional probabilities $p(j|k)$ to be either 0 or 1. This
means that the receiver can associate more than one
measurement outcome $E_k$ with the same hypothesis $j$, but it
does not help to associate two or more hypotheses with
the same outcome. Note that this means, in particular,
that in the case when the number of possible messages
$N$ is greater than the number $M$ of different outcomes
of the POVM, the best strategy is not to attempt to detect
certain messages at all. In fact (see below), even
when $M < N$, it may be advantageous to group different
POVM elements for the detection of a single state.

Let $K_j$ denote the set of those indices $k$ for
which $j^*(k) = j$, i.e., the indices $k$ for which the outcomes
$E_k$ are associated with hypothesis $j$. Note that
the sets $K_j$ are non-intersecting as shown above and that
some sets may be empty. In other words, the set of possible
assignments corresponds to that of all possible ways
to distribute $M$ elements into $N$ groups $\{K_j^\alpha\}_{j=1}^N$, where
the index $\alpha$ labels each of the $N^M$ distributions. Then
for any such choice we have

$$B^\alpha(P) = \max_{\{\rho_i\}} \sum_i \pi_i \mathrm{Tr}\Big(\rho_i \sum_j B_{ij} E_j^\alpha\Big), \tag{7}$$

where $E_j^\alpha = \sum_{k \in K_j^\alpha} E_k$. The maximum of this quantity
is achieved when each of the signal states $\rho_i$ is chosen
to be an eigenstate corresponding to the maximal eigenvalue
of the operator $\sum_j B_{ij} E_j^\alpha$, which we will denote by
$\lambda^{\max}(\sum_j B_{ij} E_j^\alpha)$. Hence, we can write

$$B(P) = \max_\alpha \boldsymbol{\pi} \cdot \mathbf{s}_\alpha, \tag{8}$$

where we have defined the vectors $\boldsymbol{\pi} = \{\pi_1, \ldots, \pi_N\}$ and
$\mathbf{s}_\alpha = \{\lambda^{\max}(\sum_j B_{1j} E_j^\alpha), \ldots, \lambda^{\max}(\sum_j B_{Nj} E_j^\alpha)\}$.

We thus see that the problem reduces to that of finding
the sets $K_j$ for which the quantity in Eq. (8) is maximal.
The corresponding partition specifies which outcomes $k$
of the POVM measurement the receiver has to associate
with a given classical message $j$. The optimal encoding
strategy is to encode each classical message $i$ into
an eigenstate $\rho_i$ corresponding to the maximal eigenvalue
of $\sum_j B_{ij} E_j^\alpha$ for the optimal grouping (note that these
states can always be chosen to be pure).
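
The finite search over groupings described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes real symmetric 2x2 (qubit) POVM elements so that the largest eigenvalue has a closed form, and the names `lam_max`, `axpy`, `best_grouping`, the trine POVM, and the priors are all hypothetical choices made for the example.

```python
import itertools
import math

def lam_max(a):
    """Largest eigenvalue of a real symmetric 2x2 matrix [[p, q], [q, r]]."""
    p, q, r = a[0][0], a[0][1], a[1][1]
    return 0.5 * (p + r) + math.hypot(0.5 * (p - r), q)

def axpy(a, b, w=1.0):
    """Return a + w * b for 2x2 matrices stored as nested lists."""
    return [[a[i][j] + w * b[i][j] for j in range(2)] for i in range(2)]

ZERO = [[0.0, 0.0], [0.0, 0.0]]

def best_grouping(povm, priors, gain):
    """Maximise pi . s_alpha over all N**M assignments of outcomes k to messages j."""
    n, m = len(priors), len(povm)
    best_value, best_assign = -1.0, None
    for assign in itertools.product(range(n), repeat=m):
        # grouped[j] is E_j^alpha: the sum of the elements assigned to message j
        grouped = [ZERO] * n
        for k, j in enumerate(assign):
            grouped[j] = axpy(grouped[j], povm[k])
        value = 0.0
        for i in range(n):
            op = ZERO                      # accumulates sum_j B_ij E_j^alpha
            for j in range(n):
                op = axpy(op, grouped[j], gain[i][j])
            value += priors[i] * lam_max(op)
        if value > best_value:
            best_value, best_assign = value, assign
    return best_value, best_assign

# Hypothetical test case: trine POVM, two messages, minimum-error gain B_ij = delta_ij.
trine = []
for k in range(3):
    c, s = math.cos(k * math.pi / 3), math.sin(k * math.pi / 3)
    trine.append([[2 / 3 * c * c, 2 / 3 * c * s], [2 / 3 * c * s, 2 / 3 * s * s]])
identity_gain = [[1.0, 0.0], [0.0, 1.0]]

value_flat, _ = best_grouping(trine, [0.5, 0.5], identity_gain)
value_skewed, assign_skewed = best_grouping(trine, [0.9, 0.1], identity_gain)
print(value_flat, value_skewed, assign_skewed)
```

In this example the flat prior gives a success probability of 5/6, while for the skewed prior the best grouping assigns two of the three outcomes to the likelier message. The brute-force loop visits all $N^M$ assignments and is therefore only practical for small $M$.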

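In general, the optimal grouping $\alpha^*$ of POVM elements,
$\alpha^* = \arg\max_\alpha \boldsymbol{\pi} \cdot \mathbf{s}_\alpha$, will depend on the given
priors $\boldsymbol{\pi}$. The region in the corresponding simplex where
one particular grouping is optimal defines a polytope, or,
more precisely, a convex polytope when restricted to the
region $\pi_1 \geq \pi_2 \geq \ldots \geq \pi_N$ (throughout the paper this
ordering of prior probabilities will always be assumed),
i.e., if $\boldsymbol{\pi} \cdot (\mathbf{s}_{\alpha^*} - \mathbf{s}_\alpha) \geq 0$ and $\boldsymbol{\pi}' \cdot (\mathbf{s}_{\alpha^*} - \mathbf{s}_\alpha) \geq 0$, then for
$0 \leq p \leq 1$ one has $[p\boldsymbol{\pi} + (1 - p)\boldsymbol{\pi}'] \cdot (\mathbf{s}_{\alpha^*} - \mathbf{s}_\alpha) \geq 0$.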

The described optimization procedure involves calculating
and comparing a finite set of quantities. In contrast,
the straight version of the problem in the general
case is a linear program that requires maximization over
a continuous set. Even though the task of finding the
optimal encoding for a given decoding POVM exhibits
an apparent similarity with the problem of finding the
optimal POVM for a given encoding (see the symmetry
of the cost function (1) with respect to interchanging the
POVM elements and the input states), an important difference
between the straight and reverse problems is that
the quantities over which we maximize in the straight version
have to satisfy the constraint $\sum_j E_j = I$, whereas
in the reverse case there is no constraint on the signal
states $\rho_i$.

Observe that in the case when N < M, the above 
optimal strategy requires at least one of the messages 
to be associated with multiple measurement outcomes. 
However, as mentioned earlier, even in the case when 
N > M, it may be advantageous to associate more than 
one outcome of the POVM with the same state. For 
example, in the problem of minimum error discrimina- 
tion, two POVM elements may have very similar (or even 
identical) maximal eigenvalues and corresponding maxi- 
mal eigenstates, but all prior probabilities of the differ- 
ent input messages may differ significantly. Then it is 
not difficult to see (see examples in the last section) that 
associating the two measurement outcomes in question 
with two different messages would be worse than asso- 
ciating both of them with one of the messages — the one 
that has a higher prior probability. 

Note that the special case of minimum error discrimi- 
nation with a given POVM has been previously studied 
in Ref. [11] as part of the problem of optimal encoding
of classical information in a quantum system for minimal
error discrimination when both the encoding and
the measurement can be optimized. However, the solution
provided in Ref. [11] for a fixed POVM is not truly
optimal since it assumes that different outcomes must be 
associated with different states. 

We remark that in certain cases it may be possible 
to simplify the general procedure described above. For 
example, in the problem of minimum error discrimination,
when the prior distribution is flat, $\pi_i = 1/N$,
$i = 1, \ldots, N$, and $M \leq N$, all we need to do is encode
$M$ of the $N$ different possible messages into the
eigenstates corresponding to the maximal eigenvalues
of the different POVM elements. In this case, associating
multiple measurement outcomes with the same
message does not help since $(1/N)\lambda^{\max}(E_j + E_k) \leq
(1/N)\lambda^{\max}(E_j) + (1/N)\lambda^{\max}(E_k)$.

For a binary source, the minimum error probability can
be written in a particularly simple form. In this case, the
POVM grouping is $\{E^\alpha, I - E^\alpha\}$. We start by discussing the
unbiased case (i.e., $\pi_1 = \pi_2 = 1/2$), for which

$$p_s^\alpha = \max_{\{\rho_1, \rho_2\}} \frac{1}{2} \left[\mathrm{Tr}(E^\alpha \rho_1) + \mathrm{Tr}((I - E^\alpha) \rho_2)\right]
= \frac{1}{2} \Big\{1 + \max_{\{\rho_1, \rho_2\}} \mathrm{Tr}[E^\alpha (\rho_1 - \rho_2)]\Big\}. \tag{9}$$

The maximum occurs when $\rho_1$ and $\rho_2$ are the eigenstates
corresponding to the largest and smallest eigenvalues of $E^\alpha$,
respectively. The difference between these two values is
known as the spread of a matrix, defined for a generic
matrix $A$ as $\mathrm{Spr}(A) = \max_{ij} |\lambda_i - \lambda_j|$, where $\lambda_i$ are the
characteristic roots of $A$ [10]. Hence,

$$p_s = \frac{1}{2} \Big[1 + \max_\alpha \mathrm{Spr}(E^\alpha)\Big]. \tag{10}$$

Notice the resemblance to Helstrom's well-known
state discrimination formula [1], where the trace distance
has been replaced by the (semi-norm) spread.

From Eq. (8), the success probability for arbitrary
priors reads

$$p_s = \max_\alpha \{\pi_1 \lambda^{\max}(E^\alpha) + \pi_2 [1 - \lambda^{\min}(E^\alpha)]\}. \tag{11}$$

It is clear that when one signal is given with a prior
probability larger than the success probability attained
by a two-outcome POVM $\{E, I - E\}$, it pays to assign all
outcomes to the most probable signal. In other words,
the measurement does not add information to our prior
knowledge, and the optimal grouping results in the trivial
POVM $\{I, 0\}$. The transition occurs at $\pi_1 = p_s$. More
explicitly, the trivial POVM is optimal if

$$\pi_1 \geq \frac{1 - \lambda^{\min}(E)}{[1 - \lambda^{\min}(E)] + [1 - \lambda^{\max}(E)]}. \tag{12}$$

Notice that if $\lambda^{\max}(E) = 1$, it is always advantageous
to perform the measurement, irrespective of the prior
probabilities.
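
The transition between the measuring strategy and the trivial POVM can be checked numerically. The sketch below evaluates the expressions labeled (11) and (12) for a single two-outcome grouping; the function names and the eigenvalues 0.2 and 0.7 are hypothetical values chosen for illustration.

```python
def success_probability(pi1, lam_min, lam_max):
    """Success probability for one grouping {E, I - E}, compared with the trivial POVM."""
    measured = pi1 * lam_max + (1.0 - pi1) * (1.0 - lam_min)
    return max(measured, pi1)            # always guessing the likelier message gives pi1

def trivial_threshold(lam_min, lam_max):
    """Prior pi1 at or above which the trivial POVM {I, 0} is optimal.

    Undefined for E = I (lam_min = lam_max = 1), where the denominator vanishes.
    """
    return (1.0 - lam_min) / ((1.0 - lam_min) + (1.0 - lam_max))

# Hypothetical element E with extreme eigenvalues 0.2 and 0.7.
threshold = trivial_threshold(0.2, 0.7)
at_threshold = 0.7 * threshold + (1.0 - threshold) * (1.0 - 0.2)
print(threshold, at_threshold)
```

At the threshold prior the measured success probability coincides with the prior itself, which is the transition point stated in the text. When the largest eigenvalue equals 1 the threshold equals 1, so measuring is never worse than guessing.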



III. UNAMBIGUOUS MESSAGE 
DISCRIMINATION 



Unambiguous quantum state discrimination [3HS1 [TJ H]
concerns the task of identifying which out of a set of possible
states one has received in such a way as to ensure
no error whenever a conclusive answer is given. In general,
such conclusive answers cannot always be given, and
the problem consists in maximizing the probability with
which they occur.

Let $\{E_i\}$ be the POVM the receiver has been provided
with and let us allow, as in the previous section,
some side information processing that will result in new
POVMs, $\{\tilde{E}_j\}$ [see Eq. (2)]. For the purpose of unambiguously
identifying a given set of messages $i = 1, \ldots, N$,
encoded in the quantum states $\rho_i$, $i = 1, \ldots, N$, these
POVMs must consist of $N + 1$ elements: $\tilde{E}_1, \ldots, \tilde{E}_N$,
representing the conclusive answers, and an additional element
$\tilde{E}_?$ that represents the inconclusive one. It must
hold that

$$\mathrm{Tr}(\tilde{E}_i \rho_j) = 0, \quad \forall i \neq j, \tag{13}$$

since errors are not allowed in conclusive answers. Any
of the elements $\tilde{E}_1, \ldots, \tilde{E}_N, \tilde{E}_?$ can be the zero operator
as a special case.

One can readily see that all the conditional probabilities
$p(j|k)$ that define $\{\tilde{E}_i\}$ in terms of the original
POVM through Eq. (2) can be taken to be either 0 or 1,
as for the Bayes cost problem, i.e., $\{\tilde{E}_i\}$ can be taken
to be sums of certain subsets of the original POVM elements.
This is so because there is no way one can unambiguously
identify two or more messages that have been
associated with a given $E_i$ if the corresponding outcome
takes place. (If some outcome $i$ occurs with zero probability,
we can add $E_i$ to any of the elements $\tilde{E}_1, \ldots, \tilde{E}_N, \tilde{E}_?$,
as this would not change the probabilities of the respective
outcomes.) Similarly, if $E_k$ is randomly associated
with both a given message $i$ [i.e., $0 < p(i|k)$] and the
inconclusive answer [i.e., $0 < p(?|k)$], the probability of
success would increase with the choice $p(i|k) = 1$.

Thus, for the unambiguous discrimination of $N$ input
states $\rho_i$, $i = 1, \ldots, N$, each occurring with prior
probability $\pi_i$, consider some grouping of the original
POVM elements into $N + 1$ elements, $E_1^\alpha, \ldots, E_N^\alpha, E_?^\alpha$,
where, as in the previous section, $\alpha$ labels the various
grouping possibilities. Condition (13) requires that
$\rho_i \in \mathcal{K}_i^\alpha = \bigcap_{j \neq i} \ker E_j^\alpha$ for each $i$. Conversely, if each
$\rho_i$ is chosen to belong to this intersection (assuming it is
non-empty), then unambiguous discrimination would be
possible with probability

$$p_s^\alpha = \sum_{i=1}^N \pi_i \mathrm{Tr}(E_i^\alpha \rho_i). \tag{14}$$

Let $P_i^\alpha$ denote the projector on $\mathcal{K}_i^\alpha$. Note that this projector
can be easily computed because $E_j^\alpha \geq 0$ implies
that $\mathcal{K}_i^\alpha = \ker(\sum_{j \neq i} E_j^\alpha)$. Since $\rho_i = P_i^\alpha \rho_i P_i^\alpha$, Eq. (14)
can be written as

$$p_s^\alpha = \sum_{i=1}^N \pi_i \mathrm{Tr}(P_i^\alpha E_i^\alpha P_i^\alpha \rho_i), \tag{15}$$



and we can maximize each of the traces by choosing $\rho_i$ to
be an eigenstate of $P_i^\alpha E_i^\alpha P_i^\alpha$ with maximal eigenvalue.
Let us denote this eigenvalue by $\lambda^{\max}(P_i^\alpha E_i^\alpha P_i^\alpha)$. Then,
we have

$$p_s^\alpha = \sum_{i=1}^N \pi_i \lambda^{\max}(P_i^\alpha E_i^\alpha P_i^\alpha) = \boldsymbol{\pi} \cdot \mathbf{s}_\alpha, \tag{16}$$

where, as before, $\boldsymbol{\pi} = \{\pi_1, \pi_2, \ldots, \pi_N\}$, and $\mathbf{s}_\alpha =
\{\lambda^{\max}(P_1^\alpha E_1^\alpha P_1^\alpha), \ldots, \lambda^{\max}(P_N^\alpha E_N^\alpha P_N^\alpha)\}$ in decreasing
order of value (this, actually, defines the labeling of the
POVM elements $E_i^\alpha$). Note that this ordering ensures
maximization of the overlap $\boldsymbol{\pi} \cdot \mathbf{s}_\alpha$. The probability of
success of the optimal message discrimination protocol is

$$p_s = \max_\alpha p_s^\alpha = \max_\alpha \boldsymbol{\pi} \cdot \mathbf{s}_\alpha. \tag{17}$$

Here $\alpha$ takes $(N + 1)^M / N!$ different values, namely, the
number of different ways of distributing $M$ POVM elements
in $N + 1$ sets, where the sums of the elements in each
of these sets are $E_1^\alpha, \ldots, E_N^\alpha, E_?^\alpha$, respectively ($N!$ takes
into account the specific labeling defined above). Note
that certain sets may be empty, i.e., we allow some of
the new POVM elements to be the zero operator (the
corresponding message will never be identified in these
cases).

To compute $p_s$ we may consider the following procedure.
Pick a grouping $\alpha$ and construct each of the projectors
$P_i^\alpha$ on the intersection $\mathcal{K}_i^\alpha$ for $i = 1, 2, \ldots, N$. If
some $\mathcal{K}_i^\alpha$ is empty, terminate the calculation and consider
a different grouping $\alpha'$. If there is an empty intersection
for all $\alpha$, the problem does not have a solution (other
than the trivial $\tilde{E}_? = I$), which means that the given
POVM $\{E_i\}$ cannot be used to unambiguously discriminate
$N$ messages. For each grouping such that $\mathcal{K}_i^\alpha \neq \emptyset$,
$i = 1, 2, \ldots, N$, compute $\mathbf{s}_\alpha$, and pick the one, $\alpha^*$, that
maximizes (17). Optimal detection is attained with the
POVM measurement $\{E_1^{\alpha^*}, \ldots, E_N^{\alpha^*}, E_?^{\alpha^*}\}$, and the optimal
encoding of each classical message $i$ is provided by
an eigenstate $\rho_i$ of $P_i^{\alpha^*} E_i^{\alpha^*} P_i^{\alpha^*}$ with maximal eigenvalue
(note that the states $\rho_i$ can always be chosen to be pure).
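
The procedure above can be sketched as a brute-force search. The following Python illustration is restricted, as an assumption made for brevity, to real symmetric 2x2 (qubit) POVM elements, for which kernels and maximal eigenvalues have closed forms. All names (`kernel`, `unambiguous_success`, the trine POVM, the priors) are hypothetical, and the loop enumerates every labeled assignment rather than using the ordering convention of the text.

```python
import itertools
import math

TOL = 1e-9
ZERO = [[0.0, 0.0], [0.0, 0.0]]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def lam_max(a):
    p, q, r = a[0][0], a[0][1], a[1][1]
    return 0.5 * (p + r) + math.hypot(0.5 * (p - r), q)

def kernel(a):
    """Kernel of a real symmetric PSD 2x2 matrix: 'all', a unit vector, or None."""
    p, q, r = a[0][0], a[0][1], a[1][1]
    if max(abs(p), abs(q), abs(r)) < TOL:
        return "all"                     # zero operator: every state is allowed
    if abs(p * r - q * q) > TOL:
        return None                      # full rank: the kernel is trivial
    u = (p, q) if p >= r else (q, r)     # a nonzero column spans the range
    norm = math.hypot(u[0], u[1])
    return (-u[1] / norm, u[0] / norm)   # unit vector orthogonal to the range

def unambiguous_success(povm, priors):
    """Maximise sum_i pi_i * lam_max(P_i E_i P_i) over all feasible groupings."""
    n, m = len(priors), len(povm)
    best = 0.0                           # value of the trivial choice E_? = I
    for assign in itertools.product(range(n + 1), repeat=m):  # label n = inconclusive
        grouped = [ZERO] * n
        for k, j in enumerate(assign):
            if j < n:
                grouped[j] = add(grouped[j], povm[k])
        value, feasible = 0.0, True
        for i in range(n):
            others = ZERO                # sum of E_j^alpha over j != i
            for j in range(n):
                if j != i:
                    others = add(others, grouped[j])
            ker = kernel(others)
            if ker is None:
                feasible = False         # empty intersection: discard this grouping
                break
            e = grouped[i]
            if ker == "all":
                value += priors[i] * lam_max(e)
            else:
                v0, v1 = ker             # P_i is the rank-one projector on this vector
                value += priors[i] * (v0 * v0 * e[0][0] + 2 * v0 * v1 * e[0][1]
                                      + v1 * v1 * e[1][1])
        if feasible:
            best = max(best, value)
    return best

trine = []
for k in range(3):
    c, s = math.cos(k * math.pi / 3), math.sin(k * math.pi / 3)
    trine.append([[2 / 3 * c * c, 2 / 3 * c * s], [2 / 3 * c * s, 2 / 3 * s * s]])

ps_flat = unambiguous_success(trine, [0.5, 0.5])
ps_skewed = unambiguous_success(trine, [0.9, 0.1])
print(ps_flat, ps_skewed)
```

For the trine example with equal priors, the best strategy gives one outcome to each message and succeeds with probability 1/2. With priors 0.9 and 0.1 it becomes better to identify only the likelier message, which illustrates that some grouped elements may be the zero operator.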

The above solution to the reverse unambiguous dis- 
crimination problem works for any POVM. In contrast, 
there is no known solution to the straight version of the 
same problem for an arbitrary ensemble of mixed input 
states (see, e.g., Ref. [8]). As in the case of minimum er- 
ror discrimination, there are certain similarities between 
the problem of finding the optimal encoding for a given 
POVM and that of finding the optimal POVM for a given 
encoding: for the latter, the POVM elements $\{E_i\}$ have to be chosen
such that $E_i \in \bigcap_{j \neq i} \ker \rho_j$, which resembles the condition
$\rho_i \in \bigcap_{j \neq i} \ker E_j$ in the reverse problem. Furthermore,
in the two problems, one has to maximize the same
quantity, Eq. (14), where states and POVM elements
play essentially the same role (they are interchangeable).
Recall, however, that in the straight case optimization
has the additional constraint $\sum_i E_i = I$, which makes
the problem more difficult.



IV. MUTUAL INFORMATION 

The problems considered in the previous sections char- 
acterize the ability of a POVM measurement to perform 
certain information readout tasks (e.g., minimum error 
discrimination or unambiguous message discrimination) 
with respect to a given source of classical messages described
by the prior probabilities $\{\pi_i\}$. These results
are strongly dependent on the source. For example, if 
the source consists of only a single message, each of the 
tasks can be accomplished with unit probability using 
any measurement. Such a source, however, is trivial as 
it contains no information. In this section, we consider 
a source-independent characterization of the ability of a 
measurement to extract information which is provided 
by the maximum mutual information that can be estab- 
lished between the sender and the receiver over all pos- 
sible sources and suitable encodings at the sender's side 
for the given POVM measurement at the receiver's side. 



Consider an information source characterized by the
probability distribution $\{\pi_i\}$, $i = 1, \ldots, N$, and an encoding
$i \to \rho_i$. The joint probabilities of the input messages
and the outcomes of the POVM measurement $\{E_j\}$,
$j = 1, \ldots, M$, are

$$P_{ij} = \pi_i \mathrm{Tr}(\rho_i E_j). \tag{18}$$

The mutual information between the input and the output
is given by

$$I(P) = \sum_i \eta\Big(\sum_j P_{ij}\Big) + \sum_j \eta\Big(\sum_i P_{ij}\Big) - \sum_{ij} \eta(P_{ij}), \tag{19}$$

where $\eta(x) = -x \log x$.

We will be interested in the maximum of $I(P)$ over all
possible source distributions $\{\pi_i\}$ and encoding strategies
$i \to \rho_i$, that is, over all input ensembles $\{\pi_i, \rho_i\}$,

$$C(\{E_j\}) = \max_{\{\pi_i, \rho_i\}} I(P). \tag{20}$$



Note that, according to the data processing inequal- 
ity, post-processing of information at the receiver's side 
cannot increase the mutual information, so in this case 
it cannot help to group POVM elements (or randomize 
outcomes) . 
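
The mutual information of a given joint probability matrix is straightforward to evaluate numerically. The sketch below assumes base-2 logarithms (the text does not fix the base) and uses a hypothetical ensemble of three equiprobable qubit states, each orthogonal to one element of a trine POVM, so that $P_{ij} = 1/6$ for $i \neq j$ and 0 otherwise. The function names are illustrative.

```python
import math

def eta(x):
    """eta(x) = -x log2(x), with the convention eta(0) = 0."""
    return -x * math.log2(x) if x > 0 else 0.0

def mutual_information(joint):
    """Mutual information of a joint probability matrix given as a list of rows."""
    rows = [sum(row) for row in joint]
    cols = [sum(col) for col in zip(*joint)]
    return (sum(eta(r) for r in rows) + sum(eta(c) for c in cols)
            - sum(eta(p) for row in joint for p in row))

# Hypothetical joint matrix: input i never produces outcome i, other outcomes are equally likely.
joint = [[0.0 if i == j else 1 / 6 for j in range(3)] for i in range(3)]
info = mutual_information(joint)

# An uncorrelated joint matrix carries no information.
independent = mutual_information([[0.25, 0.25], [0.25, 0.25]])
print(info, independent)
```

For this ensemble the function returns $\log_2(3/2)$ bits. The sketch only evaluates one ensemble and makes no claim about whether it is optimal.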

As shown by the following proposition, $C(\{E_i\})$ has a
natural interpretation as the capacity of the measurement
$\{E_i\}$, which for all practical purposes can be modeled by
a quantum channel of the form $\mathcal{E}(\rho) = \sum_j \mathrm{Tr}(\rho E_j) |j\rangle\langle j|$,
where $\{|j\rangle\}$ are orthogonal states that carry the classical
information about the outcome of the measurement.

Proposition 1. $C(\{E_i\})$ is equal to the classical capacity
of the channel

$$\mathcal{E}(\rho) = \sum_j \mathrm{Tr}(\rho E_j) |j\rangle\langle j|. \tag{21}$$

Proof. It is known [12, 13] that the classical capacity of
a quantum channel $\mathcal{M}$ over independent uses of the channel
(i.e., when no entanglement between multiple inputs
to the channel is allowed) is given by the quantity

$$\chi(\mathcal{M}) = \max_{\{\pi_i, \rho_i\}} \Big\{ S\Big[\sum_i \pi_i \mathcal{M}(\rho_i)\Big] - \sum_i \pi_i S[\mathcal{M}(\rho_i)] \Big\}, \tag{22}$$

where $S(\rho) = -\mathrm{Tr}(\rho \log \rho)$ denotes the von Neumann entropy
of the state $\rho$. The general capacity of the channel,
allowing possibly entangled inputs, is

$$C(\mathcal{M}) = \lim_{n \to \infty} \frac{1}{n} \chi(\mathcal{M}^{\otimes n}), \tag{23}$$

where $\mathcal{M}^{\otimes n}$ denotes $n$ uses of the channel. For
entanglement-breaking channels [2], such as the
quantum-to-classical channel $\mathcal{E}(\rho)$ above, it has been
shown that the quantity $\chi(\mathcal{E})$ is additive [15-17], in particular
$\chi(\mathcal{E} \otimes \mathcal{E}) = 2\chi(\mathcal{E})$, which implies that

$$C(\mathcal{E}) = \chi(\mathcal{E}). \tag{24}$$

Furthermore, for any input ensemble $\{\pi_i, \rho_i\}$, the channel
$\mathcal{E}(\rho)$ outputs an ensemble of commuting quantum
states, $\{\pi_i, \mathcal{E}(\rho_i)\}$, and for such an ensemble it is easy to
verify that the quantity $S[\sum_i \pi_i \mathcal{E}(\rho_i)] - \sum_i \pi_i S[\mathcal{E}(\rho_i)]$
is equal to the mutual information in Eq. (19). The
proposition then follows from the definitions (20) and
(22). □


A comment is in order here. The classical capacity of a
channel is the maximum rate at which information can be
transmitted reliably through the channel in the limit of
infinitely many uses. Since the optimal measurement for
extracting information from the channel $\mathcal{E}(\rho)$ is a projective
measurement in the basis $\{|j\rangle\}$, which preceded by
$\mathcal{E}(\rho)$ is equivalent to the POVM measurement $\{E_j\}$, the
quantity $C(\{E_j\})$ is equal to the maximum rate at which
information can be read reliably using the POVM $\{E_j\}$.

Corollary 1. We have

$$C(\{E_j\}) = \max_{\{\pi_i, \rho_i\}} \Big\{ S\Big[\sum_i \pi_i \mathcal{E}(\rho_i)\Big] - \sum_i \pi_i S[\mathcal{E}(\rho_i)] \Big\}. \tag{25}$$



Observe that we can write the joint probability (18)
in the symmetric form

$$P_{ij} = \mathrm{Tr}(\tilde{\rho}_i E_j), \tag{26}$$

where $\tilde{\rho}_i = \pi_i \rho_i$ are unnormalized positive operators
satisfying $\mathrm{Tr}(\sum_i \tilde{\rho}_i) = 1$. (Hereafter, we will use the
notations $\{\pi_i, \rho_i\}$ and $\{\tilde{\rho}_i\}$ interchangeably to denote
an ensemble of states.) In this notation, $C(\{E_i\}) =
\max_{\{\tilde{\rho}_i\}} I(P)$. Notice further that the mutual information
$I(P)$ is symmetric with respect to the indexes $i$
and $j$. Therefore, the problem we are considering can
be regarded as dual to the one of accessible information
of an ensemble of states $\{\tilde{\rho}_i\}$ [5], which can be written as

$$A(\{\tilde{\rho}_i\}) = \max_{\{E_j\}} I(P). \tag{27}$$

Note, however, that the two problems are not identical as
the operators $\{E_j\}$ satisfy a stronger constraint than the
operators $\{\tilde{\rho}_i\}$: $\sum_j E_j = I$. (A strict duality transformation
between signal ensembles and POVM measurements
has been established in Refs. [18, 19]. We will not be
concerned with that correspondence here.)

The above suggests that certain results in the study of
the accessible information of an ensemble of states may
prove useful for the study of the capacity of a measurement.
For example, the symmetry of the problems and
the difference in constraints imply

$$C(\{E_i\}) \geq A(\{\tilde{E}_i\}), \tag{28}$$

where $\tilde{E}_i = E_i/d$. Therefore, any known lower bound
on $A$ can be used to obtain a lower bound on $C$. For
example, the lower bound obtained in Ref. [20] yields

$$C(\{E_i\}) \geq Q\Big(\sum_i m_i \hat{E}_i\Big) - \sum_i m_i Q(\hat{E}_i), \tag{29}$$

where $m_i = \mathrm{Tr}(E_i)/d$, $\hat{E}_i = E_i/(m_i d)$, and $Q(\rho)$ is the
subentropy of a density matrix $\rho$, which in terms of the
eigenvalues $\lambda_k$ of $\rho$ reads [20]

$$Q(\rho) = -\sum_k \Big(\prod_{l \neq k} \frac{\lambda_k}{\lambda_k - \lambda_l}\Big) \lambda_k \log \lambda_k \tag{30}$$

(if two or more eigenvalues are equal, one takes the limit
as they become equal).
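
The subentropy formula can be evaluated directly from a list of eigenvalues. The sketch below assumes natural logarithms and distinct eigenvalues; the function name `subentropy` and the test spectra are illustrative choices. The degenerate case is only approximated by splitting the eigenvalues slightly, which mimics the limit mentioned in the text.

```python
import math

def subentropy(eigenvalues):
    """Subentropy Q (in nats) computed from distinct eigenvalues of a density matrix."""
    total = 0.0
    for k, lk in enumerate(eigenvalues):
        if lk == 0.0:
            continue                     # the factor lk * log(lk) vanishes
        weight = 1.0
        for l, ll in enumerate(eigenvalues):
            if l != k:
                weight *= lk / (lk - ll)  # requires lk != ll
        total -= weight * lk * math.log(lk)
    return total

pure = subentropy([1.0, 0.0])                       # a pure qubit state
near_mixed = subentropy([0.5 + 1e-4, 0.5 - 1e-4])   # close to the maximally mixed qubit
print(pure, near_mixed)
```

For a qubit with eigenvalues $\lambda_1, \lambda_2$ the formula reduces to $-(\lambda_1^2 \log\lambda_1 - \lambda_2^2 \log\lambda_2)/(\lambda_1 - \lambda_2)$, whose limit at $\lambda_1 = \lambda_2 = 1/2$ is $\ln 2 - 1/2$; the second call reproduces this value to within the splitting error.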

Similarly, one may wonder if the Holevo quantity
$S(\sum_i m_i \hat{E}_i) - \sum_i m_i S(\hat{E}_i)$ [21], which provides a simple
upper bound on the accessible information $A(\{\tilde{E}_i\})$,
could also provide a useful bound for the capacity
$C(\{E_i\})$. As we will see below, however, this quantity
is neither an upper nor a lower bound to $C(\{E_i\})$.

Proposition 2. The maximum in Eq. (20) can be
achieved with an ensemble of pure input states $\rho_i =
|\psi_i\rangle\langle\psi_i|$. Furthermore, the number $N$ of input states can
be made to satisfy $d \leq N \leq d^2$.

This proposition is similar to Theorem 3 in Ref. [2],
where it is shown that for a given ensemble of input
states, the optimal POVM measurement can be taken
to have rank-one POVM elements whose number $M$ satisfies
$d \leq M \leq d^2$.

Proof. As noted in Ref. [5], $I(P)$ is a convex function
over the convex set of $N \times M$ probability matrices
$P$ with fixed row sums. By a similar argument, $I(P)$ is
a convex function over the convex set of $N \times M$ probability
matrices $P$ with fixed column sums. This implies
that if $P'$ is an $(N - 1) \times M$ probability matrix obtained
from $P$ by replacing two rows by their row sum, then
$I(P') \leq I(P)$, with equality when the two rows are proportional.
Therefore, for any input ensemble $\{\pi_i, \rho_i\}$,
where $\rho_i = \sum_k p_{ik} |\psi_{ik}\rangle\langle\psi_{ik}|$, we can consider the pure-state
ensemble $\{\pi_i p_{ik}, |\psi_{ik}\rangle\langle\psi_{ik}|\}$, which has mutual information
with the output no less than that of $\{\pi_i, \rho_i\}$.
(Note that we can assume that no two states $|\psi_{ik}\rangle\langle\psi_{ik}|$
are identical, since if they are, we can combine them into
a single state with prior probability equal to the sum of
their prior probabilities, which does not change the mutual
information.) Hence, the maximum in Eq. (20) is
attained for an ensemble of different pure states.
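
The row-merging step used in this argument can be checked numerically. The sketch below uses base-2 logarithms and arbitrary illustrative joint matrices; the helper names are hypothetical. It verifies on these examples that merging two rows does not increase the mutual information and leaves it unchanged when the rows are proportional.

```python
import math

def eta(x):
    return -x * math.log2(x) if x > 0 else 0.0

def mutual_information(joint):
    rows = [sum(row) for row in joint]
    cols = [sum(col) for col in zip(*joint)]
    return (sum(eta(r) for r in rows) + sum(eta(c) for c in cols)
            - sum(eta(p) for row in joint for p in row))

def merge_rows(joint, a, b):
    """Replace rows a and b of the joint matrix by their elementwise sum."""
    merged = [x + y for x, y in zip(joint[a], joint[b])]
    rest = [row for i, row in enumerate(joint) if i not in (a, b)]
    return [merged] + rest

joint = [[0.30, 0.10], [0.10, 0.20], [0.05, 0.25]]   # arbitrary illustrative numbers
before = mutual_information(joint)
after = mutual_information(merge_rows(joint, 0, 1))

proportional = [[0.10, 0.20], [0.20, 0.40], [0.10, 0.00]]  # rows 0 and 1 are proportional
unchanged = mutual_information(merge_rows(proportional, 0, 1))
print(before, after, mutual_information(proportional), unchanged)
```

Read in reverse, the same inequality is what allows each mixed signal state to be split into its pure-state components without lowering the mutual information.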

Next, observe that Eq. (20) can be written as

$$C(\{E_j\}) = \max_\rho \max_{\{\pi_i, \psi_i\}_\rho} I(P), \tag{31}$$

where the left maximization is over all density matrices
$\rho$, and the right maximization is over all ensembles
$\{\pi_i, \psi_i\}_\rho$ of pure states $\psi_i = |\psi_i\rangle\langle\psi_i|$, whose averages are
equal to $\rho$, $\sum_i \pi_i |\psi_i\rangle\langle\psi_i| = \rho$. (We note that the quantity
$\max_{\{\pi_i, \psi_i\}_\rho} I(P)$ for a fixed $\rho$ has been previously considered
in relation to methods for obtaining bounds on the
mutual information [19].) Following closely the proof in
Ref. [2], we will show that for any $\rho$, $\max_{\{\pi_i, \psi_i\}_\rho} I(P)$ can
be achieved by an ensemble of at most $d^2$ states. Indeed,
the latter maximization is equivalent to a maximization
over the convex set $Y$ of probability distributions with
finite support on the set of pure states, whose average is
equal to $\rho$. Note that the different ensembles $\{\pi_i, \psi_i\}_\rho$
give rise to joint probability matrices $P$ with fixed row
sums equal to $\mathrm{Tr}(\rho E_j)$, which according to the convexity
property pointed out earlier implies that $I(P)$ is a convex
function on $Y$. Hence, the maximum is achieved for
an extreme point of $Y$, which by Carathéodory's theorem
can be shown to be a probability distribution whose
support has $\leq 1 + \dim \mathcal{A}$ points, where $\mathcal{A}$ is the convex
set of density operators of which the pure states we are
considering are extreme points. Since $\dim \mathcal{A} = d^2 - 1$,
we obtain $N \leq d^2$.

To show that in general $d \leq N$, we will use the fact
that for every $d$, there are certain types of POVMs for
which the optimal $\rho$ in Eq. (31) is full-rank (in particular,
we will show below (Theorem 1) that when the POVM is
covariant under the irreducible representation of a finite
group, the maximum in Eq. (31) is achieved for $\rho = I/d$).
If we assume that $d > N$, there must exist a vector $|\psi\rangle$,
$\langle\psi|\psi\rangle = 1$, such that $\langle\psi|\psi_i\rangle = 0$, $\forall i = 1, \ldots, N$. But then
$\langle\psi|\rho|\psi\rangle = \sum_i \pi_i |\langle\psi|\psi_i\rangle|^2 = 0$, which is in contradiction
with $\rho$ being full-rank. □

We next consider the case of a group covariant POVM measurement, which is dual to the problem of accessible information for a group covariant input ensemble [2]. For this purpose, we need to introduce some terminology. Let $S$ denote the set of all states on a Hilbert space $\mathcal{H}$ of dimension $d$. Following Ref. [2], we will regard a representation $R$ of a group $G$ as a homomorphism from $G$ to the affine automorphisms of $S$, where every such automorphism is representable in the form $\alpha(\rho) = U \rho U^\dagger$ with $U$ being a unitary or an antiunitary operator (we will consider the action of $R$ automatically extended to all operators over $\mathcal{H}$ by linearity). A representation of $G$ is irreducible if the only $G$-invariant point of $S$ is $I/d$.

We will say that the POVM $\{E_j\}$, $j = 1, \dots, M$, is $G$-covariant if there exists a surjection $f : G \to \{E_j\}$, where we denote $f(g) := E_g$, such that $R_g(E_h) = E_{gh}$, $\forall g, h \in G$. Note that every element $E_j$ must equal $E_g$ for at least one $g \in G$, but this correspondence may be degenerate, i.e., a given $E_j$ may be associated with two or more elements of the group. The fact that $G$ is a group implies that this degeneracy must be the same for every element $E_j$, and hence $M$ must be a factor of $|G|$.

Theorem 1 (The group covariant case). If the POVM $\{E_j\}$ is covariant with respect to the finite group $G$ that has an irreducible representation $R$ on $S$, then there exists a pure state $|\psi\rangle\langle\psi|$, $\langle\psi|\psi\rangle = 1$, such that the maximum in Eq. (20) is achieved by the covariant ensemble of pure input states $\{|G|^{-1}, R^*_g(|\psi\rangle\langle\psi|)\}$, where $|G|$ is the number of elements of $G$, and $R^*$ denotes the representation of $G$ dual to $R$. The capacity of $\{E_j\}$ is

$$C(\{E_j\}) = \log d + \sum_{j=1}^{M} \langle\psi|E_j|\psi\rangle \log \frac{\langle\psi|E_j|\psi\rangle}{\mathrm{Tr}\,E_j}. \tag{32}$$

Proof. Let $\{\pi_i, \psi_i\}$ be an ensemble of pure input states that maximizes the mutual information for the given covariant POVM measurement $\{E_j\}$. Construct a new input ensemble $\{\pi_{ig}, \psi_{ig}\}$, where

$$\psi_{ig} = R^*_g(\psi_i) \quad \text{and} \quad \pi_{ig} = \pi_i |G|^{-1}. \tag{33}$$

The new probability matrix $\tilde{P}$ obtained using this ensemble has the form

$$\tilde{P} = |G|^{-1} \begin{pmatrix} P_1 \\ P_2 \\ \vdots \\ P_{|G|} \end{pmatrix}, \tag{34}$$



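where each of the probability matrices $P_1, P_2, \dots, P_{|G|}$ is obtained from $P$ by a permutation of the rows and columns of $P$, and the column sums of $\tilde{P}$ are all equal to $|G|^{-1}$. A straightforward calculation shows that the new probability matrix yields a value for the mutual information which is no less than that obtained for $P$, i.e., $I(\tilde{P}) \ge I(P)$. Writing $\eta(x) = -x \log x$ and labeling the rows of $\tilde{P}$ by the pairs $(i, g)$,

$$\begin{aligned}
I(\tilde{P}) &= \sum_{i,g} \eta\Big(\sum_j \tilde{P}_{ig,j}\Big) + \sum_j \eta\Big(\sum_{i,g} \tilde{P}_{ig,j}\Big) - \sum_{i,g,j} \eta(\tilde{P}_{ig,j}) \\
&= |G| \sum_i \eta(|G|^{-1}\pi_i) + \sum_j \eta(|G|^{-1}) - |G| \sum_{i,j} \eta(|G|^{-1} P_{ij}) \\
&= \sum_i \eta(\pi_i) + \log|G| - \sum_{i,j} \eta(P_{ij}) \\
&\ge \sum_i \eta\Big(\sum_j P_{ij}\Big) + \sum_j \eta\Big(\sum_i P_{ij}\Big) - \sum_{i,j} \eta(P_{ij}) = I(P).
\end{aligned} \tag{35}$$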

Now, consider the covariant input ensembles $\{|G|^{-1}, \psi^i_g\}$, where

$$\psi^i_g = R^*_g(\psi_i). \tag{36}$$

Let us denote the probability matrices that each of these ensembles yields by $\tilde{P}_i$. Since the ensemble $\{\pi_{ig}, \psi_{ig}\}$ is a convex combination of the ensembles $\{|G|^{-1}, \psi^i_g\}$, and the mutual information is a convex function of the input ensemble, we obtain

$$I(P) \le I(\tilde{P}) \le \max_i I(\tilde{P}_i), \tag{37}$$



i.e., the maximum in Eq. (20) is achieved for one of the covariant input ensembles $\{|G|^{-1}, \psi^i_g\}$, which has the form stated in the theorem. The value of the capacity [Eq. (32)] is obtained by a straightforward calculation, taking into account the possible degeneracy in the correspondence between the group elements and the POVM elements. □
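The symmetrization step of the proof can be checked numerically on a small instance. The sketch below is only an illustration under stated assumptions: it uses the qubit SIC-POVM with one particular (assumed) choice of tetrahedral Bloch vectors, and takes for $G$ the four Pauli rotations $\{I, X, Y, Z\}$, which permute these POVM elements and act irreducibly on qubit states. All function and variable names are hypothetical.

```python
import numpy as np

# Pauli operators; conjugation by {I, X, Y, Z} permutes the POVM below.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
GROUP = [I2, X, Y, Z]

# Assumed tetrahedral Bloch vectors (one of many equivalent choices).
NS = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
POVM = [(I2 + n[0] * X + n[1] * Y + n[2] * Z) / 4 for n in NS]

def eta(x):
    """Elementwise -x log2(x), with the convention 0 log 0 = 0."""
    x = np.asarray(x, dtype=float)
    safe = np.where(x > 0, x, 1.0)
    return -x * np.log2(safe)

def mutual_information(P):
    """I(P) for a joint probability matrix P[i, j] (inputs i, outcomes j)."""
    return eta(P.sum(axis=1)).sum() + eta(P.sum(axis=0)).sum() - eta(P).sum()

def joint_matrix(priors, states, povm):
    """P[i, j] = pi_i Tr(rho_i E_j)."""
    return np.array([[p * np.trace(rho @ E).real for E in povm]
                     for p, rho in zip(priors, states)])

def random_pure_state(rng):
    v = rng.normal(size=2) + 1j * rng.normal(size=2)
    v /= np.linalg.norm(v)
    return np.outer(v, v.conj())

rng = np.random.default_rng(0)
priors = rng.dirichlet(np.ones(3))
states = [random_pure_state(rng) for _ in range(3)]

# Symmetrized ensemble: pi_{ig} = pi_i / |G|, psi_{ig} = U_g psi_i U_g^dagger.
sym_priors = [p / len(GROUP) for p in priors for _ in GROUP]
sym_states = [U @ rho @ U.conj().T for rho in states for U in GROUP]

I_orig = mutual_information(joint_matrix(priors, states, POVM))
I_sym = mutual_information(joint_matrix(sym_priors, sym_states, POVM))
avg_state = sum(p * rho for p, rho in zip(sym_priors, sym_states))
```

In this instance `I_sym` should be no smaller than `I_orig`, and `avg_state` should equal $I/2$, in line with the irreducibility of the representation.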

Notice that since the average of a group covariant ensemble is $G$-invariant, from the irreducibility of $R$ it follows that $\sum_g |G|^{-1} \psi_g = I/d$. This shows that indeed for every $d$ there are POVM measurements that require at least $d$ optimal input states, as argued in the proof of Proposition 2.

Comment. The optimal "seed" $|\psi\rangle\langle\psi|$ may be such that the input ensemble $\{|G|^{-1}, R^*_g(|\psi\rangle\langle\psi|)\}$ contains identical states, i.e., it may be that $R^*_g(|\psi\rangle\langle\psi|) = R^*_h(|\psi\rangle\langle\psi|)$ for certain $g \neq h$. The fact that $G$ is a group implies that each maximal set of identical states in the ensemble must contain the same number of elements (and hence the number $N$ of distinct states in the ensemble must be a factor of $|G|$). It is straightforward to see that the ensemble $\{N^{-1}, |\psi_i\rangle\langle\psi_i|\}$ obtained from $\{|G|^{-1}, R^*_g(|\psi\rangle\langle\psi|)\}$ by identifying the identical states and redefining their probabilities as the sum of the original probabilities, is also optimal. This is because the joint probabilities resulting from the input ensemble $\{N^{-1}, |\psi_i\rangle\langle\psi_i|\}$ can be transformed into those resulting from $\{|G|^{-1}, R^*_g(|\psi\rangle\langle\psi|)\}$ by local postprocessing on the sender's side, which cannot increase the mutual information. Hence, the number of states in the optimal ensemble in general may be smaller than $|G|$ (just as the number of outcomes of a group covariant POVM may be smaller than $|G|$). This is the case, for example, with the optimal ensemble for the two-dimensional SIC-POVM studied in Section V C, which has 4 elements while the symmetry group has 12.

Corollary 2. In the group covariant case, we have

$$C(\{E_j\}) = A(\{\tilde{E}_j\}). \tag{38}$$

Moreover, if the POVM measurement $\{F_j\}$ optimizes the mutual information for the input ensemble $\{\tilde{E}_i\}$, the input ensemble $\{\tilde{F}_j\}$, where $\tilde{F}_j = F_j/d$, optimizes the mutual information for the measurement $\{E_i\}$.

Since under this symmetry the problem is equivalent to that of accessible information of a covariant input ensemble, any known results in the latter case can be applied here (see, e.g., Ref. [5]). In particular, in Section V C we calculate the capacity of the two-dimensional SIC-POVM.

Another important case in which calculating the capacity of a measurement reduces to a well-known problem is that of a POVM $\{E_i\}$ with commuting elements, $[E_i, E_j] = 0$, $\forall i, j$. In this case, we can assume that the optimal signal states $\rho_i$ are diagonal in the eigenbasis of $\{E_j\}$, since for any $\rho$, the state $\rho' = \mathrm{diag}(\rho_{nn})$, where $\rho_{nn}$ are the diagonal elements of $\rho$ in the eigenbasis of $\{E_j\}$, yields the same values for the joint probabilities $\mathrm{Tr}(\rho E_j)$. Furthermore, as we saw in the proof of Proposition 2, the optimal input ensemble can be taken to consist of the eigenstates of all $\rho_i$, which means that the maximum in Eq. (20) is achieved for an ensemble of input states which are the common eigenbasis of $\{E_i\}$. Hence the joint probabilities are $P_{ij} = \pi_i \lambda^i_j$, where $\lambda^i_j$ is the $i$-th eigenvalue of $E_j$, and the problem reduces to finding $\max_{\{\pi_i\}} I(P)$, which is the capacity of the classical channel described by the conditional probability matrix $p(j|i) = \lambda^i_j$. Note that a measurement with two outcomes necessarily has commuting POVM elements, i.e., the capacity of a two-outcome measurement is always equal to the capacity of a classical channel with a binary output. Thus, for example, the capacity of a two-outcome qubit measurement that has elements $E_1 = \mathrm{diag}(\alpha, \beta)$, $E_2 = \mathrm{diag}(1 - \alpha, 1 - \beta)$ in some basis can be obtained from the formula for the capacity of a general binary channel [31]

$$C = \frac{\alpha H(\beta) - \beta H(\alpha)}{\beta - \alpha} + \log\left[1 + \exp\left(\frac{H(\alpha) - H(\beta)}{\beta - \alpha}\right)\right], \tag{39}$$



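where $H(q) = -q \log q - (1 - q) \log(1 - q)$, $q \in [0, 1]$, is the entropy of a binary source. The optimal prior distribution in this case is $\{p, 1 - p\}$, where [31]

$$p = \frac{1}{\beta - \alpha}\left[\frac{1}{1 + \exp\left(\frac{H(\alpha) - H(\beta)}{\beta - \alpha}\right)} - (1 - \beta)\right]. \tag{40}$$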
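As a sanity check on the binary-channel capacity formula, the following sketch compares the closed form with a brute-force maximization over the prior. It assumes logarithms in base 2 throughout, so the exponential in the closed form becomes a power of 2; the function names are hypothetical, and `alpha`, `beta` are the diagonal entries of $E_1$.

```python
import math

def H(q):
    """Binary entropy in bits, with the convention 0 log 0 = 0."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def capacity_closed_form(alpha, beta):
    """Closed-form capacity in bits (exp replaced by 2**x); needs alpha != beta."""
    x = (H(alpha) - H(beta)) / (beta - alpha)
    return (alpha * H(beta) - beta * H(alpha)) / (beta - alpha) + math.log2(1 + 2 ** x)

def mutual_information(p, alpha, beta):
    """I for the prior {p, 1 - p} on the two common eigenstates of E1, E2."""
    q = p * alpha + (1 - p) * beta  # probability of outcome 1
    return H(q) - p * H(alpha) - (1 - p) * H(beta)

def capacity_brute_force(alpha, beta, steps=20001):
    """Maximize the mutual information over a uniform grid of priors."""
    return max(mutual_information(k / (steps - 1), alpha, beta)
               for k in range(steps))

alpha, beta = 0.9, 0.2
c_formula = capacity_closed_form(alpha, beta)
c_numeric = capacity_brute_force(alpha, beta)
```

For a projective measurement (`alpha = 1`, `beta = 0`) the closed form gives one bit, as expected for a rank-one projective measurement on a qubit.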



We can now see that, as mentioned earlier, the naively constructed Holevo quantity $S(\sum_i m_i \tilde{E}_i) - \sum_i m_i S(\tilde{E}_i)$, where $m_i = \mathrm{Tr}(E_i)/d$, $\tilde{E}_i = E_i/(m_i d)$, in general is neither an upper nor a lower bound to $C(\{E_i\})$. Indeed, it is known that the accessible information of an ensemble of density matrices is equal to the Holevo quantity of the ensemble if and only if all density matrices in the ensemble commute, and the maximal value of the mutual information is attained for a projective measurement in the common eigenbasis of the input ensemble. From the symmetry of the problem we see that for a POVM with commuting elements, the quantity $S(\sum_i m_i \tilde{E}_i) - \sum_i m_i S(\tilde{E}_i)$ is equal to the mutual information between the equiprobable input ensemble of common eigenstates of $\{E_i\}$ and the outputs of the measurement $\{E_i\}$. However, from Eq. (40) it can be seen that an equiprobable prior distribution is generally suboptimal for this case, i.e., the quantity $S(\sum_i m_i \tilde{E}_i) - \sum_i m_i S(\tilde{E}_i)$ can be strictly smaller than $C(\{E_i\})$. On the other hand, in the group covariant case we have $C(\{E_i\}) = A(\{\tilde{E}_i\})$, where in general $A(\{\tilde{E}_i\})$ is strictly smaller than $S(\sum_i m_i \tilde{E}_i) - \sum_i m_i S(\tilde{E}_i)$.

We remark that the maximal possible mutual information for an input ensemble of states on a Hilbert space of dimension $d$ and any POVM measurement is $\log d$. This can be easily seen from Holevo's upper bound on the accessible information [21]. Moreover, this quantity is achievable only by an ensemble of pure commuting input states that sum up to the maximally mixed state, i.e., by an equiprobable ensemble of orthogonal basis states. The unique optimal measurement for such an ensemble is a projective measurement on the basis in question. Conversely, any rank-one projective measurement has capacity $\log d$, which is achievable by the equiprobable input ensemble of corresponding basis states. Hence, rank-one projective measurements have the highest capacity.



V. EXAMPLE: THE SIC-POVM ON A QUBIT 

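In this section, we apply the above results to the case of a symmetric informationally complete (SIC) POVM on a qubit, as well as to a noisy, or unsharp, version of this POVM. A SIC-POVM [H in dimension $d$ consists of a set of rank-one positive operators, $E_i = (1/d)|\psi_i\rangle\langle\psi_i|$, where the pure states $|\psi_i\rangle$ are such that $|\langle\psi_i|\psi_j\rangle|^2 = 1/(d+1)$ for $i \neq j$. The measurement is called "complete" in the sense that its statistics is sufficient for the full tomography of any quantum state [211 El]. SIC-POVMs are of particular interest due to their various applications in quantum information, including quantum tomography [26], quantum cryptography [37], and the foundations of quantum mechanics [5S].

Up to a change of basis, the POVM elements of such a measurement for $d = 2$ can be written as

$$E_i = \frac{1}{4}(I + \mathbf{n}_i \cdot \boldsymbol{\sigma}), \quad i = 1, 2, 3, 4, \tag{41}$$

where $\boldsymbol{\sigma}$ is the vector of Pauli matrices

$$\boldsymbol{\sigma} = \left( \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \right), \tag{42}$$

and $\mathbf{n}_1, \dots, \mathbf{n}_4$ are the unit Bloch vectors of the four POVM elements, which point to the vertices of a regular tetrahedron, so that $\mathbf{n}_i \cdot \mathbf{n}_j = -1/3$ for $i \neq j$. (43)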



In order to illustrate the relation between the "sharpness" of a measurement and its ability to read out information, we will consider a more general, noisy version of the above SIC-POVM, where each outcome is mixed with some amount of white noise,

$$E_i(\epsilon) = \epsilon E_i + (1 - \epsilon)\frac{I}{4} = \frac{1}{4}(I + \epsilon\, \mathbf{n}_i \cdot \boldsymbol{\sigma}), \quad i = 1, 2, 3, 4, \quad 0 \le \epsilon \le 1. \tag{44}$$
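The noisy family of Eq. (44) is easy to construct and test numerically. The sketch below is a minimal illustration; the specific tetrahedral Bloch vectors in `NS` are an assumed choice (any rotated tetrahedron would serve equally well), and the helper names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULI = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)

# Assumed tetrahedral Bloch vectors: unit length, pairwise inner product -1/3.
NS = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)

def noisy_sic(eps):
    """Return the four elements E_i(eps) = (I + eps n_i . sigma) / 4."""
    return [(I2 + eps * np.einsum("k,kab->ab", n, PAULI)) / 4 for n in NS]

def outcome_probs(rho, povm):
    """Born-rule probabilities Tr(rho E_j) for each outcome j."""
    return np.array([np.trace(rho @ E).real for E in povm])

sharp = noisy_sic(1.0)      # ideal SIC-POVM
trivial = noisy_sic(0.0)    # every outcome has probability 1/4
rho_up = np.array([[1, 0], [0, 0]], dtype=complex)
```

With `eps = 0` the call `outcome_probs(rho_up, trivial)` returns four equal probabilities of 1/4, independently of the input state.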



When $\epsilon = 1$, the measurement reduces to the ideal SIC-POVM [$E_i(1) = E_i$], while as $\epsilon \to 0$, the measurement becomes infinitesimally weak [29], approaching a trivial measurement, each of its outcomes occurring with probability 1/4 independently of the input state. In this sense, $\epsilon$ can be regarded as parameterizing the "sharpness" or "strength" of the measurement.



A. Minimum error discrimination 

For simplicity, let us start with the noiseless SIC-POVM (41). Given the symmetry of the problem, it is enough to consider four groupings, $a \in \{A, B, C, D\}$:

A : $\{E_1, E_2, E_3, E_4\}$
B : $\{E_1 + E_2, E_3, E_4, 0\}$
C : $\{E_1 + E_2, E_3 + E_4, 0, 0\}$
D : $\{E_1 + E_2 + E_3, E_4, 0, 0\}$. (45)

The corresponding vectors of maximum eigenvalues (in decreasing order of value) are [see Eq. ( ) with $B_{ij} = \delta_{ij}$]

$s_A = \{1/2, 1/2, 1/2, 1/2\}$,
$s_B = \{(1 + 1/\sqrt{3})/2, 1/2, 1/2, 0\}$,
$s_C = \{(1 + 1/\sqrt{3})/2, (1 + 1/\sqrt{3})/2, 0, 0\}$,
$s_D = \{1, 1/2, 0, 0\}$, (46)

where it is understood that the vectors need to be padded with extra zeros if the number of signal states exceeds four ($N > 4$). For equiprobable signals, $\pi_i = 1/N$, the optimal success probability is given by $p_s = (1/N) \max_a \sum_{i=1}^{N} (s_a)_i$. In particular, $p_s = 1/2 + 1/(2\sqrt{3})$ for $N = 2$, $p_s = 1/2 + 1/(6\sqrt{3})$ for $N = 3$, and $p_s = 2/N$ for $N \ge 4$, which are attained by the groupings C, B and A, respectively. That is, for four signals ($N = 4$) no grouping is necessary and the signal states have to be chosen to point along the directions of the SIC-POVM (43). Any additional signals ($N > 4$) can be assigned to arbitrary states and will never contribute to the success probability. For $N = 3$ one has to group two POVM elements, leading to an unsharp effective measurement, and leave the remaining two outcomes ungrouped (i.e., sharp). In that case the three signals lie on a plane: two signals point along, say, $\mathbf{n}_1$ and $\mathbf{n}_2$ (corresponding to the sharp POVM elements), and the third points along $-(\mathbf{n}_1 + \mathbf{n}_2)$. For $N = 2$ the optimal strategy is to encode the signals into orthogonal states pointing along the directions resulting from pairwise groupings, e.g., $\mathbf{n}_1 + \mathbf{n}_2$ and $\mathbf{n}_3 + \mathbf{n}_4 = -(\mathbf{n}_1 + \mathbf{n}_2)$.
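The eigenvalue vectors and the equiprobable success probabilities quoted above can be reproduced with a few lines of NumPy. This is a sketch under an assumed choice of tetrahedral Bloch vectors; the dictionary `GROUPINGS` and the function names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULI = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)
# Assumed tetrahedral Bloch vectors (hypothetical choice of basis).
NS = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
E = [(I2 + np.einsum("k,kab->ab", n, PAULI)) / 4 for n in NS]

# Each grouping is a list of blocks; a block lists the outcomes merged together.
GROUPINGS = {
    "A": [[0], [1], [2], [3]],
    "B": [[0, 1], [2], [3]],
    "C": [[0, 1], [2, 3]],
    "D": [[0, 1, 2], [3]],
}

def max_eigenvalues(grouping):
    """Largest eigenvalue of each grouped element, in decreasing order."""
    vals = [np.linalg.eigvalsh(sum(E[i] for i in block))[-1]
            for block in grouping]
    return sorted(vals, reverse=True)

def success_probability(n_signals):
    """Best grouping and p_s = (1/N) * sum of the N largest entries of s_a."""
    scores = {name: sum(max_eigenvalues(g)[:n_signals]) / n_signals
              for name, g in GROUPINGS.items()}
    winner = max(scores, key=scores.get)
    return winner, scores[winner]
```

Calling `success_probability(N)` for `N = 2, 3, 4` selects groupings C, B and A, respectively, matching the discussion above.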

In Figure 1 we show the optimality regions for $N = 3$ and different priors. Within the region $\pi_1 \ge \pi_2 \ge \pi_3$, delimited by a dashed outline in the figure, we observe that the set of points where each particular grouping is dominant is a convex polytope. The corresponding maximum success probabilities are:

$$p_s^B = \frac{1}{2} + \frac{\pi_1}{2\sqrt{3}}, \qquad p_s^C = \frac{1}{2}\left(1 + \frac{1}{\sqrt{3}}\right)(\pi_1 + \pi_2), \qquad p_s^D = \pi_1 + \frac{\pi_2}{2}. \tag{47}$$

Note that regions C and D correspond to groupings where no outcome is assigned to the third signal state.

Figure 1: (color online) The colored regions in the prior-probability simplex for $N = 3$ indicate the various optimal groupings. The dashed outline delimits the region $\pi_1 \ge \pi_2 \ge \pi_3$. Within this region, the intersection point of B, C and D is $P = \{2\sqrt{3} - 3, 9 - 5\sqrt{3}\}$. Auxiliary thin lines are drawn to help in understanding the figure.



This illustrates the fact that there are cases (regions C and D) where it pays not to assign any measurement outcome to some of the messages ($i = 3$ in this example), even though the source emits them with non-zero prior probability. In particular, if the source is strongly biased towards one message ($\pi_1 > 1/\sqrt{3}$, in this example), all but one measurement outcome will be assigned to it (message $i = 1$).

In order to study the effect of noise, $\epsilon < 1$ in (44), one proceeds along the same lines as above. We first note that since the noise is isotropic, the optimal signal states, i.e., the eigenvectors with maximum eigenvalue of the sums of POVM elements in each grouping, A to D, are the same as those for the sharp case, and thus independent of the sharpness parameter $\epsilon$. Their corresponding maximum eigenvalues in (46) now have a noisy component that scales with the number $k_{ai}$ of elements in those sums. More precisely, the vectors of eigenvalues now have components $\epsilon (s_a)_i + k_{ai}(1 - \epsilon)/4$.

For equiprobable signals, $\pi_i = 1/N$, the optimal groupings are those that are optimal in the sharp case. Thus, they do not depend on $\epsilon$, only on the number $N$ of input states. The optimal success probabilities are now $p_s = 1/2 + \epsilon/(2\sqrt{3})$ for $N = 2$, $p_s = 1/3 + \epsilon(1 + 1/\sqrt{3})/6$ for $N = 3$, and $p_s = (1 + \epsilon)/N$ for $N \ge 4$.

In more generic cases, when the source emits symbols with arbitrary prior probabilities, the regions of optimality do depend on the noise or sharpness parameter $\epsilon$. For the case of ternary sources, $N = 3$, it is straightforward to show that the overall structure of the optimality regions is that in Figure 1, but the point $P(\epsilon)$ where B, C and D intersect moves monotonically away from $P(1) = P = \{2\sqrt{3} - 3, 9 - 5\sqrt{3}\}$ (when the POVM is sharp) to $P(0) = \{1/3, 1/3\}$ (when it is maximally unsharp).
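The location of the triple point can be checked with a short linear-algebra sketch. It assumes that, for $N = 3$, only groupings B, C and D compete, and it uses the noisy eigenvalue components $\epsilon (s_a)_i + k_{ai}(1 - \epsilon)/4$ with the priors sorted in decreasing order; the names `S`, `K`, `weights`, `success` and `intersection_point` are hypothetical.

```python
import numpy as np

R = (1 + 1 / np.sqrt(3)) / 2
# Sharp maximum eigenvalues (first three entries) and block sizes k_{ai}.
S = {"B": np.array([R, 0.5, 0.5]),
     "C": np.array([R, R, 0.0]),
     "D": np.array([1.0, 0.5, 0.0])}
K = {"B": np.array([2, 1, 1]),
     "C": np.array([2, 2, 0]),
     "D": np.array([3, 1, 0])}

def weights(eps):
    """Noisy eigenvalue vectors eps * s_a + k_a * (1 - eps) / 4."""
    return {a: eps * S[a] + K[a] * (1 - eps) / 4 for a in S}

def success(priors, eps=1.0):
    """Success probability of each grouping for priors sorted decreasingly."""
    pi = np.sort(np.asarray(priors, dtype=float))[::-1]
    return {a: float(w @ pi) for a, w in weights(eps).items()}

def intersection_point(eps):
    """Priors (pi_1, pi_2, pi_3) at which groupings B, C and D tie."""
    w = weights(eps)
    M = np.vstack([w["B"] - w["C"], w["C"] - w["D"], np.ones(3)])
    return np.linalg.solve(M, np.array([0.0, 0.0, 1.0]))

P_sharp = intersection_point(1.0)
P_flat = intersection_point(0.0)
```

The first two components of `P_sharp` reproduce $\{2\sqrt{3} - 3, 9 - 5\sqrt{3}\}$, and `P_flat` is the uniform distribution.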



B. Unambiguous discrimination 

We now turn to unambiguous discrimination with the SIC-POVM on a qubit. Clearly, the slightest amount of noise ($\epsilon < 1$) will ruin any possibility of performing unambiguous discrimination, since any signal state can trigger each of the outcomes with a non-zero probability. We thus concentrate on the ideal sharp SIC-POVM. In a two-dimensional Hilbert space one can only hope to unambiguously discriminate two states ($N = 2$; $\pi_1 < 1$), hence grouping A can be excluded as it has too many outcomes. Moreover, we need only consider groupings B and D, since only they have at least one rank-one POVM element and have, therefore, a non-empty kernel ($\mathcal{K}^a \neq 0$). If grouping B is used, two messages can be unambiguously identified by choosing the signals in the kernels of $E_3$ and $E_4$, respectively, that is, $\rho_1 = (1 - \mathbf{n}_3 \cdot \boldsymbol{\sigma})/2$ and $\rho_2 = (1 - \mathbf{n}_4 \cdot \boldsymbol{\sigma})/2$, so that outcome 4 can only be triggered by $\rho_1$ and outcome 3 by $\rho_2$ (i.e., $\tilde{E}_1 = E_4$, $\tilde{E}_2 = E_3$ and $\tilde{E}_? = E_1 + E_2$). This leads to a probability of successful identification given by

$$p_s^B = \pi_1 \mathrm{Tr}(\tilde{E}_1 \rho_1) + \pi_2 \mathrm{Tr}(\tilde{E}_2 \rho_2) = \frac{1 - \mathbf{n}_4 \cdot \mathbf{n}_3}{4} = \frac{1}{3}, \tag{48}$$



which is independent of the prior probabilities $\{\pi_1, \pi_2\}$. Proceeding along the same lines, one finds that for grouping D one can only unambiguously identify the state $\rho_1 = (1 + \mathbf{n}_4 \cdot \boldsymbol{\sigma})/2$ with $\tilde{E}_1 = E_4$, by excluding $\rho_2 = (1 - \mathbf{n}_4 \cdot \boldsymbol{\sigma})/2$ (i.e., $\rho_2 \in \ker \tilde{E}_1$), while all other outcomes of the original POVM will be necessarily inconclusive ($\tilde{E}_? = I - E_4$). Obviously, no outcome will be associated to message $i = 2$ ($\tilde{E}_2 = 0$). The success probability is

$$p_s^D = \pi_1 \mathrm{Tr}(\tilde{E}_1 \rho_1) = \frac{\pi_1}{2}, \tag{49}$$

which beats that of grouping B for $\pi_1 > 2/3$.
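The two unambiguous strategies can be compared numerically. The sketch below again assumes a particular tetrahedral choice of Bloch vectors, and the function names are hypothetical; the internal assertions check that the excluded state never triggers the conclusive outcome.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULI = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)
# Assumed tetrahedral Bloch vectors (hypothetical choice of basis).
NS = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
E = [(I2 + np.einsum("k,kab->ab", n, PAULI)) / 4 for n in NS]

def bloch_state(n):
    """Qubit state (I + n . sigma) / 2 with Bloch vector n."""
    return (I2 + np.einsum("k,kab->ab", n, PAULI)) / 2

def prob(E_j, rho):
    return np.trace(E_j @ rho).real

def ps_grouping_B(pi1):
    """Signals in the kernels of E_3 and E_4; outcomes 4 and 3 are conclusive."""
    rho1, rho2 = bloch_state(-NS[2]), bloch_state(-NS[3])
    assert abs(prob(E[2], rho1)) < 1e-12 and abs(prob(E[3], rho2)) < 1e-12
    return pi1 * prob(E[3], rho1) + (1 - pi1) * prob(E[2], rho2)

def ps_grouping_D(pi1):
    """Only message 1 is identified, via outcome 4; rho_2 lies in ker E_4."""
    rho1, rho2 = bloch_state(NS[3]), bloch_state(-NS[3])
    assert abs(prob(E[3], rho2)) < 1e-12
    return pi1 * prob(E[3], rho1)
```

For example, `ps_grouping_B(0.3)` evaluates to 1/3 regardless of the prior, while `ps_grouping_D(pi1)` equals `pi1 / 2` and overtakes grouping B once `pi1` exceeds 2/3.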



C. Mutual information 

The SIC-POVM on a qubit, including its noisy version, is covariant under the tetrahedral group (indeed, the tips of the Bloch vectors (43) corresponding to the POVM elements define the vertices of a tetrahedron). Therefore, according to Theorem 1 in Section IV, the mutual information for this POVM is maximized by an ensemble of pure input states possessing the same symmetry. Its maximal value, i.e., the capacity of the measurement, is given by Eq. (32) for a state $\psi$ from the optimal ensemble (all other states in the ensemble are obtained from $\psi$ by applying operators of the symmetry group, i.e., $\psi$ plays the role of a "seed" for the ensemble).

Theorem 2 (Capacity of the noisy two-level SIC-POVM). For every value of $\epsilon \in [0, 1]$, the seed $\psi$ that maximizes expression (32) can be chosen such that its Bloch vector is anti-parallel to the Bloch vector of any one of the four POVM elements (44), i.e., $\mathbf{v} = -\mathbf{n}_j$. The capacity of the (generally noisy) SIC-POVM is

$$C_\epsilon = 1 + \frac{1 - \epsilon}{4} \log \frac{1 - \epsilon}{2} + 3\, \frac{1 + \epsilon/3}{4} \log \frac{1 + \epsilon/3}{2}. \tag{50}$$

This result, which applies to both the straight and reverse formulations of the problem, is interesting in its own right. As far as we are aware, previous results (for $\epsilon = 1$) relied on numerical optimization [5]. Here we provide an analytical proof for $0 \le \epsilon \le 1$.

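Proof. Let us define

$$h(t) = -\frac{1 + t}{2} \log \frac{1 + t}{2}.$$

We will first show that the following inequality holds for $-1 \le t \le 1$ and $0 \le \epsilon \le 1$:

$$h(\epsilon t) \ge a(\epsilon) + b(\epsilon) t + c(\epsilon) t^2 \equiv p_\epsilon(t), \tag{51}$$

where

$$a(\epsilon) = \frac{1}{16}\left[h(-\epsilon) + 15 h(\epsilon/3) - 4\epsilon h'(\epsilon/3)\right],$$

$$b(\epsilon) = \frac{1}{8}\left[-3h(-\epsilon) + 3h(\epsilon/3) + 4\epsilon h'(\epsilon/3)\right],$$

$$c(\epsilon) = \frac{3}{16}\left[3h(-\epsilon) - 3h(\epsilon/3) + 4\epsilon h'(\epsilon/3)\right],$$

and $h'$ is the derivative of $h$ with respect to its argument.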

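We start by noticing the following relations:

$$p_\epsilon(-1) = h(-\epsilon), \quad p_\epsilon(1/3) = h(\epsilon/3), \quad p'_\epsilon(1/3) = \epsilon h'(\epsilon/3), \tag{52}$$

and

$$\gamma(\epsilon) \equiv c(\epsilon) + \frac{\epsilon^2}{4 \ln 2} \le 0, \tag{53}$$

where the equality is attained only at $\epsilon = 0$. The first three of them are immediate. The last one is not so obvious and can be proved as follows. The function $\gamma(\epsilon)$ is concave in $[0, 1]$ since

$$\gamma''(\epsilon) = \frac{1}{2 \ln 2} - \frac{9}{2(1 - \epsilon)(3 + \epsilon)^2 \ln 2} = -\frac{\epsilon(3 + 5\epsilon + \epsilon^2)}{2(1 - \epsilon)(3 + \epsilon)^2 \ln 2} \le 0.$$

Differentiating the expression of $c(\epsilon)$ above we readily obtain

$$\gamma'(\epsilon) = \frac{3}{16}\left[-3h'(-\epsilon) + 3h'(\epsilon/3) + \frac{4}{3}\epsilon h''(\epsilon/3)\right] + \frac{\epsilon}{2 \ln 2},$$

which vanishes at $\epsilon = 0$. Thus $\gamma'(0) = \gamma''(0) = 0$ and $\gamma''(\epsilon) < 0$ if $\epsilon > 0$. Then, $\gamma(\epsilon)$ must necessarily decrease for $\epsilon > 0$, which in turn implies that $\gamma(\epsilon)$ has its unique maximum at $\epsilon = 0$. Since $\gamma(0) = 0$, Eq. (53) holds in the whole interval $[0, 1]$.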

We can now turn to proving (51). We assume that $\epsilon > 0$, since $\epsilon = 0$ is a trivial case. If $f(t) = h(\epsilon t) - p_\epsilon(t)$, then

$$f''(t) = -2c(\epsilon) - \frac{\epsilon^2}{2(1 + \epsilon t) \ln 2}.$$

It follows from this equation that there is only one value of $t$ at which $f''(t)$ vanishes. But using (53), we see that $f''(t) > 0$ for $t > 0$. Therefore, $f''(t)$ can only change sign at some $t_0 < 0$. Hence, $f(t)$ is convex in $(t_0, 1]$ and concave in $[-1, t_0)$. It can have only one minimum in $(t_0, 1]$, and according to the third relation (52), it must be at $t = 1/3$. Using the second relation (52), we see that this minimum value is 0. Thus $f(t) \ge 0$ if $t \in [t_0, 1]$. Because of the concavity of $f$ in the other interval, we just need to check the value of $f$ at the end point $t = -1$ [by continuity we must have $f(t_0) \ge 0$]. The first relation (52) ensures that $f(t) \ge 0$ also in $[-1, t_0]$.
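Inequality (51) can also be spot-checked numerically on a grid, which is a useful guard against transcription errors in the coefficients. The sketch assumes base-2 logarithms and takes $a$, $b$, $c$ to be the unique quadratic coefficients fixed by the three relations (52); the function names are hypothetical.

```python
import numpy as np

LN2 = np.log(2.0)

def h(t):
    """h(t) = -((1 + t)/2) log2((1 + t)/2), with h(-1) = 0."""
    x = (1.0 + np.asarray(t, dtype=float)) / 2.0
    safe = np.where(x > 0, x, 1.0)
    return -x * np.log2(safe)

def h_prime(t):
    """Derivative of h with respect to its argument (valid for t > -1)."""
    return -0.5 * np.log2((1.0 + t) / 2.0) - 0.5 / LN2

def coefficients(eps):
    """Quadratic coefficients a, b, c fixed by the interpolation conditions."""
    A, B, D = h(-eps), h(eps / 3), eps * h_prime(eps / 3)
    a = (A + 15 * B - 4 * D) / 16
    b = (-3 * A + 3 * B + 4 * D) / 8
    c = 3 * (3 * A - 3 * B + 4 * D) / 16
    return float(a), float(b), float(c)

def gap(eps, t):
    """f(t) = h(eps t) - p_eps(t); the inequality states that f(t) >= 0."""
    a, b, c = coefficients(eps)
    t = np.asarray(t, dtype=float)
    return h(eps * t) - (a + b * t + c * t ** 2)

ts = np.linspace(-1.0, 1.0, 401)
worst = min(gap(eps, ts).min() for eps in np.linspace(0.0, 1.0, 21))
```

On this grid `worst` is zero up to rounding, with the gap vanishing at $t = -1$ and $t = 1/3$ as the relations (52) require.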

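Now, using the inequality (51), one can show that the mutual information for the POVM (44),

$$I = 1 + \sum_{j=1}^{4} \langle\psi|E_j(\epsilon)|\psi\rangle \log \frac{\langle\psi|E_j(\epsilon)|\psi\rangle}{\mathrm{tr}\, E_j(\epsilon)} = 1 - \frac{1}{2}\sum_{j=1}^{4} h(\epsilon\, \mathbf{v} \cdot \mathbf{n}_j),$$

is bounded as

$$I \le 1 - \frac{1}{2}\sum_{j=1}^{4} p_\epsilon(\mathbf{v} \cdot \mathbf{n}_j) = 1 - \frac{1}{2}\left[4a(\epsilon) + \frac{4}{3}c(\epsilon)\right] = 1 - \frac{1}{2}\left[h(-\epsilon) + 3h(\epsilon/3)\right],$$

where we used $\sum_j \mathbf{n}_j = 0$ and $\sum_j (\mathbf{v} \cdot \mathbf{n}_j)^2 = 4/3$ for any unit vector $\mathbf{v}$. This bound is attained with any one of the four choices $\mathbf{v} = -\mathbf{n}_j$. The value of the capacity (50) is obtained by a straightforward substitution. □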
Note that in the minimum error scenario, the optimal signal ensemble is such that each state and its corresponding POVM element have maximum overlap (i.e., they are aligned to each other). In contrast, here we find that it pays to have a signal ensemble where each state would be excluded by one of the POVM outcomes in the absence of noise (i.e., states and POVM elements are anti-aligned to each other). This configuration minimizes the (average) conditional entropy of the output (the POVM outcomes) given the input signal ensemble [recall that the mutual information (32) can be obtained by subtracting this conditional entropy from the entropy of the output, which is constant here].

As expected, the capacity attains its maximal value $C_1 = \log(4/3)$ for $\epsilon = 1$ (the ideal SIC-POVM) and monotonically decreases towards 0 as $\epsilon$ approaches 0. Note that, as pointed out in Corollary 2, the capacity of such a group covariant POVM is equal to the accessible information of an equiprobable ensemble of states proportional to the original POVM elements [cf. Eq. (38)].

The latter problem, in the case $\epsilon = 1$, was studied in Ref. [2], where it was shown that the accessible information of the corresponding ensemble is $A = \log(4/3)$, which is equal to $C_1$. The capacity of the ideal SIC-POVM has also been previously obtained in Ref. [19] by a different approach.
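The closed-form capacity can be compared against a random search over seed states. The sketch below assumes base-2 logarithms and a particular tetrahedral choice of Bloch vectors; `mutual_information` evaluates $1 - \frac{1}{2}\sum_j h(\epsilon\, \mathbf{v} \cdot \mathbf{n}_j)$ for a seed with Bloch vector $\mathbf{v}$, and all names are hypothetical.

```python
import numpy as np

# Assumed tetrahedral Bloch vectors (hypothetical choice of basis).
NS = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)

def xlog2x(x):
    """x log2(x) with the convention 0 log 0 = 0."""
    x = np.asarray(x, dtype=float)
    safe = np.where(x > 0, x, 1.0)
    return x * np.log2(safe)

def mutual_information(v, eps):
    """Mutual information of the covariant ensemble seeded by Bloch vector v."""
    x = (1.0 + eps * (NS @ v)) / 2.0
    return 1.0 + 0.5 * float(np.sum(xlog2x(x)))

def capacity(eps):
    """Closed-form capacity of the noisy qubit SIC-POVM, in bits."""
    x, y = (1.0 - eps) / 2.0, (1.0 + eps / 3.0) / 2.0
    return 1.0 + 0.5 * float(xlog2x(x)) + 1.5 * float(xlog2x(y))

def random_search(eps, samples=2000, seed=0):
    """Best mutual information found over random unit Bloch vectors."""
    rng = np.random.default_rng(seed)
    vs = rng.normal(size=(samples, 3))
    vs /= np.linalg.norm(vs, axis=1, keepdims=True)
    return max(mutual_information(v, eps) for v in vs)

caps = [capacity(e) for e in np.linspace(0.0, 1.0, 11)]
```

The list `caps` increases from 0 at `eps = 0` to $\log_2(4/3) \approx 0.415$ bits at `eps = 1`, and no randomly drawn seed exceeds the closed-form value.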

VI. CONCLUSION 

In summary, we have studied the problem of optimal 
signal states for information readout with a given quan- 
tum detector. We considered some of the most com- 
mon information transmission problems — the Bayes cost 
problem, unambiguous message discrimination, and the 
maximal mutual information. We provided solutions to 
the Bayesian and unambiguous discrimination strategies. 
We also showed that the maximal mutual information is 
equal to the classical capacity of the measurement and 
studied its properties in certain special cases. For a group 
covariant measurement, we obtained that the problem is 
equivalent to the problem of accessible information of a 
group covariant ensemble of states. As an example, we 
applied our results for the different discrimination strate- 
gies to the case of a SIC-POVM on a qubit, including a 
noisy version of that POVM. 

An interesting question for future investigation is whether, and under what conditions, the optimal solutions provided here are unique. Another question of significant interest would be to obtain an upper bound on the capacity of a measurement. We provided a lower bound which is obtained from a lower bound on the accessible information, but that lower bound could also be improved. It would also be interesting to investigate the continuity properties of the optimal quantities considered in this paper. For example, if two measurements are close in terms of the distance functions introduced in Ref. [32], are their capacities also close?

Finally, we note that the capacity of a POVM provides a very natural and source-independent means to give a quantitative characterization of a generalized quantum measurement. However, it cannot be used as the unique figure of merit against which measurement devices should be benchmarked. Ultimately, the performance of a given measurement apparatus strongly depends on the task it is meant to accomplish. For instance, a noisy Stern-Gerlach measurement might have a higher capacity than that of an ideal SIC-POVM; however, it would be misleading to claim that such a Stern-Gerlach measurement outperforms the SIC-POVM, since the latter can carry out tasks, such as full single-qubit tomography or unambiguous state discrimination, that are impossible to achieve with the former.

Note added. Almost simultaneously with the posting of this paper, two concurrent works appeared, by M. Dall'Arno, G. M. D'Ariano, and M. F. Sacchi (Ref. [33]) and by A. S. Holevo (Ref. [51]), which also introduce and study the capacity of a POVM measurement.